34 research outputs found
AN ENTITY-CENTRIC APPROACH TO GREEN INFORMATION SYSTEMS
The integration of sustainable thinking and performance within day-to-day business activities has become an important business need. Sustainable business requires information on the use, flows and destinies of energy, water, and materials including waste, along with monetary information on environment-related costs, earnings, and savings. Creating this holistic view of economic, social and environmental information is not a straightforward task from an IT perspective, and implies tackling several challenges such as information granularity and overload, the different projections of the same factual information, and the heterogeneity of information systems. In this paper, we propose an entity-centric approach to Green Information Systems to assist organisations in forming a cohesive representation of the environmental impact of their business operations at both micro- and macro-levels. Initial results from a small and medium-sized enterprise case study are discussed, along with future research directions.
Automatic Anomaly Detection over Sliding Windows: Grand Challenge
With advances in the Internet of Things and the rapid generation of vast amounts of data, there is an ever-growing need for leveraging and evaluating event-based systems as a basis for building real-time data analytics applications. The ability to detect, analyze, and respond to abnormal patterns of events in a timely manner is as challenging as it is important. For instance, a distributed processing environment might affect the required order of events, time-consuming computations might fail to scale, or delayed alarms might lead to unpredicted system behavior. The ACM DEBS Grand Challenge 2017 focuses on real-time anomaly detection for manufacturing equipment based on the observation of a stream of measurements generated by embedded digital and analogue sensors. In this paper, we present our solution to the challenge, leveraging the Apache Flink stream processing framework and anomaly ordering based on sliding windows, and evaluate its performance in terms of event latency and throughput.
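As a rough illustration of anomaly detection over a sliding window, the following Python sketch flags a measurement when it deviates strongly from the mean of the preceding window. It is not the solution described in the abstract and does not use Apache Flink; the function name `detect_anomalies`, the window size, the threshold, and the sample readings are all assumptions chosen for illustration, and a simple standard-deviation test stands in for whatever detection model the challenge prescribes.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(stream, window_size=5, threshold=3.0):
    """Yield (index, value) for readings far from the mean of the preceding window.

    Hypothetical helper: a reading is flagged when it lies more than
    `threshold` population standard deviations from the window mean.
    """
    window = deque(maxlen=window_size)  # holds the most recent readings
    for index, value in enumerate(stream):
        if len(window) == window_size:  # only test once the window is full
            mu = mean(window)
            sigma = pstdev(window)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
        window.append(value)  # slide the window forward by one reading


# Made-up sensor readings with one obvious spike at position 6.
readings = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 25.0, 10.0, 9.8]
anomalies = list(detect_anomalies(readings))
print(anomalies)
```

In this sketch the spike itself enters the window after being flagged, which inflates the deviation for the next few readings; a production system would need to decide whether flagged values should be excluded from the window.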
Internet of Things Enhanced User Experience for Smart Water and Energy Management
Smart environments can engage a wide range of end users with different interests and priorities, from corporate managers looking to improve the performance of their business to school children who want to explore and learn more about the world around them. Creating an effective user experience within a smart environment (from smart buildings to smart cities) is an important factor for success. In this article, we reflect on our experience of developing Internet-of-Things-enabled applications within a smart home, school, office building, university, and airport, where the goal has been to engage a wide range of users (from building managers to business travelers) to increase water and energy awareness, management, and conservation.
Technical Research Priorities for Big Data
To drive innovation and competitiveness, organisations need to foster the development and broad adoption of data technologies, value-adding use cases and sustainable business models. Enabling an effective data ecosystem requires overcoming several technical challenges associated with the cost and complexity of management, processing, analysis and utilisation of data. This chapter details a community-driven initiative to identify and characterise the key technical research priorities for research and development in data technologies. The chapter examines the systemic and structured methodology used to gather inputs from over 200 stakeholder organisations. The result of the process identified five key technical research priorities in the areas of data management, data processing, data analytics, data visualisation and user interactions, and data protection, together with 28 sub-level challenges. The process also highlighted the important role of data standardisation, data engineering and DevOps for Big Data
Loose coupling in heterogeneous event-based systems via approximate semantic matching and dynamic enrichment
There has been a significant change in the data landscape with the emergence of the Internet of Things (IoT). Tens of billions of devices are expected to connect to the Internet in the coming years within smart buildings, smart grids, smart cities, and cyber-physical systems. A basic requirement to realize the IoT is an infrastructure of sensing and communication solutions. Middleware systems, such as event processing, are also required to abstract the application developers from the underlying technologies.
Large-scale event processing environments are open, distributed, and heterogeneous in semantics and contexts. Interoperability is a key requirement and is currently addressed by top-down granular agreements represented by ontologies and taxonomies for semantics. Such approaches are non-scalable, and achieving such agreements may be unfeasible under the characteristics of current and future event environments such as the IoT. This thesis analyses this problem using a decoupling versus coupling trade-off framework.
Event producers and consumers do not know each other and are decoupled in space, time, and synchronization to enable scalable deployments. They have boundaries that they have to cross in order to communicate with other systems. Such boundaries are syntactic, semantic, and pragmatic. Events are boundary objects that convey meanings signified by symbols. They must effectively cross the three levels of boundaries to establish interoperability and communication between event agents.
The current event processing paradigm is focused on crossing lower syntactic boundaries. Thus, human agents are needed in the loop to cross semantic and pragmatic boundaries through explicit agreements on event types, properties, values, and contexts, introducing coupling into these systems. Coupling limits the paradigm and contradicts the fundamental basis of decoupling for scalability. A trade-off can be concluded between decoupling for scalability and coupling for interoperability.
Space, time, and synchronization decoupling dimensions of event systems contribute to event transfer. I define two new types of problematic coupling dimensions: the semantic coupling and the pragmatic coupling. They correspond to granular and labour-intensive agreements on event semantics and contexts by humans involved in developing and using the event system. Such agreements may not be feasible in large-scale environments such as the IoT. Current approaches to semantic and context interoperability in event processing are coupled on one or more of these two dimensions, limiting scalability.
This thesis concerns two research questions of how semantic and pragmatic coupling can be loosened effectively and efficiently. I propose an approach based on four elements: subsymbolic semantics, free tagging, dynamic native enrichment, and approximation. A statistical vector-space model of semantics is built from a textual corpus that reflects the mutual understanding of event producers and consumers. Subscriptions are consumers' expressions to match events of interest. Free tags, called themes, are added to events and subscriptions to improve their meanings. Subscriptions are enhanced with indications of context to dynamically enrich events. Terms in events and subscriptions are decoded into their subsymbolic vector representations that are then matched using an approximate probabilistic matcher, resulting in scored relevance of events to subscriptions.
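The matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's matcher: the three-dimensional vectors in `VECTORS` are invented (a real model would be derived from a text corpus), the function names `cosine` and `relevance` are hypothetical, and multiplying the best per-term similarities is a simplification standing in for the approximate probabilistic matcher.

```python
import math

# Hypothetical toy vectors; a real system would learn them from a corpus.
VECTORS = {
    "energy": (0.9, 0.1, 0.0),
    "power": (0.8, 0.2, 0.1),
    "consumption": (0.1, 0.9, 0.1),
    "usage": (0.2, 0.8, 0.2),
    "parking": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relevance(subscription_terms, event_terms):
    """Score an event against a subscription.

    Each subscription term is paired with its most similar event term,
    and the best similarities are multiplied into a single score.
    """
    score = 1.0
    for sub_term in subscription_terms:
        best = max(cosine(VECTORS[sub_term], VECTORS[ev_term])
                   for ev_term in event_terms)
        score *= best
    return score


subscription = ["energy", "usage"]
matching_event = ["power", "consumption"]  # different words, similar meaning
unrelated_event = ["parking"]

print(relevance(subscription, matching_event))
print(relevance(subscription, unrelated_event))
```

The subscription never mentions "power" or "consumption", yet the first event scores far higher than the second, which is the sense in which one approximate subscription can stand in for many exact ones.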
The hypotheses underlying the proposed approach are empirically validated within synthetic and real-world scenarios from the smart cities and energy management domains. A loose semantic coupling can be achieved with coarse-grained agreements on statistical semantics, with 100 approximate subscriptions compensating for 74,000 exact subscriptions otherwise needed. The approximate matcher achieves a throughput on the order of 1,000 events/sec and an effectiveness of over 95% F1-measure. Thematic tagging requires only a small number of tags: around 2-7 for events and 2-15 for subscriptions. It delivers a throughput on the order of 800 events/sec in the worst case and 85% F1-measure, as opposed to 62% in the worst case for non-thematic processing.
Loose pragmatic coupling is achieved with 4 high-level clauses in the subscriptions to guide the dynamic enricher. They specify the source, the retrieval method, the context search strategy, and the fusion method of events with context. Enrichment is instantiated with spreading activation in Linked Data graphs. It is tested with 24,000 events, using live DBpedia, a structured version of Wikipedia, as a contextual source. Its efficiency and effectiveness are 7 times those of other instantiations of the enricher.
The research discussed in this thesis has been deployed in working systems for energy and water management, where it has had an impact on real-world applications. The model has also been developed into the concept of thingsonomies, an architecture for the Internet of Things that can tackle variety and allows IoT systems to evolve into large-scale, heterogeneous, and loosely coupled environments.
Challenges with image event processing: Poster
There has been substantial research in the area of event processing, where systems are focused on event processing of structured data. However, in the context of smart cities, a significant number of real-time applications for event-driven systems consist of image data, rather than structured events. Therefore, there is a need for a system that can process multimedia events such as images. This paper discusses challenges with processing images within event-based systems.
This work was supported by Science Foundation Ireland under grant SFI/12/RC/2289 INSIGHT and grant 13/RC/2094. Non-peer-reviewed.
Feeling anxious? Perceiving anxiety in tweets using machine learning
This study provides a predictive measurement tool to examine perceived anxiety from a longitudinal perspective, using a non-intrusive machine learning approach to scale human rating of anxiety in microblogs. Results suggest that our chosen machine learning approach depicts perceived user state-anxiety fluctuations over time, as well as mean trait anxiety. We further find a reverse relationship between perceived anxiety and outcomes such as social engagement and popularity. Implications on the individual, organizational, and societal levels are discussed.
Demo: Approximate Semantic Matching in the COLLIDER Event Processing Engine
This demo presents a use case from the energy management domain. It builds upon previous work on approximate semantic matching of heterogeneous events and compares two semantic matching scenarios: exact and approximate. It illustrates how a large number of exact matching event subscriptions are needed to match heterogeneous power consumption events. It then demonstrates how a small number of approximate semantic matching subscriptions are needed but possibly with a lower true positives/negatives performance. The demo is delivered via the COLLIDER approximate event processing engine currently under development in DERI
Word Re-Embedding via Manifold Dimensionality Retention
Word embeddings seek to recover a Euclidean metric space by mapping words into vectors, starting from word co-occurrences in a corpus. Word embeddings may underestimate the similarity between nearby words, and overestimate it between distant words in the Euclidean metric space. In this paper, we re-embed pre-trained word embeddings with a stage of manifold learning which retains dimensionality. We show that this approach is theoretically founded in the metric recovery paradigm, and empirically show that it can improve on state-of-the-art embeddings in word similarity tasks by 0.5 to 5.0 percentage points, depending on the original space.
Approximate Semantic Matching of Events for the Internet of Things
Event processing follows a decoupled model of interaction in space, time, and synchronization. However, another dimension of semantic coupling also exists and poses a challenge to the scalability of event processing systems in highly semantically heterogeneous and dynamic environments such as the Internet of Things (IoT). Current state-of-the-art approaches of content-based and concept-based event systems require a significant agreement between event producers and consumers on event schema or an external conceptual model of event semantics. Thus, they do not address the semantic coupling issue. This article proposes an approach where participants only agree on a distributional statistical model of semantics represented in a corpus of text to derive semantic similarity and relatedness. It also proposes an approximate model for relaxing the semantic coupling dimension via an approximation-enabled rule language and an approximate event matcher. The model is formalized as an ensemble of semantic and top-k matchers along with a probability model for uncertainty management. The model has been empirically validated on large sets of events and subscriptions synthesized from real-world smart city and energy management systems. Experiments show that the proposed model achieves more than 95% F1 score of effectiveness and thousands of events/sec of throughput for medium degrees of approximation while not requiring users to have complete prior knowledge of event semantics. In semantically loosely-coupled environments, one approximate subscription can compensate for hundreds of exact subscriptions to cover all possibilities in environments which require complete prior knowledge of event semantics. Results indicate that approximate semantic event processing could play a promising role in the IoT middleware layer.